Abstract. In English as a Foreign Language (EFL) situations, it is important for
educators to improve learners’ sound recognition skills, given the varieties of
English found around the world. Furthermore, perceptual skill is a foundation leading
to intelligibility in production. This study examined the effects of using High 
Variability Phonetic Training (HVPT) in computer assisted pronunciation training 
on the recognition and production of English phonemes, which are challenging 
for Japanese learners of English. Between pre-, mid-, and post-tests, the learners 
completed training sessions three times a week in two sound environments. 
The results demonstrated improvement in recognition skill with larger effects 
immediately after training. For production skill, however, the effects were not large,
and the outcomes were mixed relative to the improvement in perception. Further research is
suggested under a condition in which articulation practice immediately follows
identification of individual training items.


Keywords: pronunciation, HVPT (high variability phonetic training), computer
assisted pronunciation training.


1. Introduction 


HVPT is a training method in which learners perceive L2 sounds produced by multiple
talkers in multiple phonetic contexts. It has been applied to language programs
for learners from a variety of L1 backgrounds due to its effectiveness and
generalizability (Thomson, 2018).
HVPT has been proven effective in helping learners to distinguish L2 sounds that
are confusing due to their similarity to L1 sounds (Munro & Derwing, 2006). This
perceptual training approach has also led to improvement in learner production by
way of better intelligibility scores (Bradlow, Akahane-Yamada, Pisoni, & Tohkura,
1999).


While numerous studies have examined the efficacy of HVPT for training Japanese
listeners to perceive English /l/ and /r/ contrasts (Bradlow et al., 1999; Logan,
Lively, & Pisoni, 1991, among many others), we are aware of such studies
being conducted only in highly controlled phonetic laboratories. Further, with few
exceptions, most studies have not examined whether this training transfers to
production (Thomson, 2018). Finally, previous /l/-/r/ studies for Japanese learners
focus on a binary distinction, which overlooks the fact that English /r/ and /w/ are
also known to be confusable (Guion, Flege, Akahane-Yamada, & Pruitt, 2000).


In this study, Thomson’s (2017) English Accent Coach is used because it comprises
thirty distinct talkers for each sound in each phonetic context and has been gamified
to make it more interesting to learners. The research questions are:


• What are the effects of HVPT on perception of English /l/-/r/-/w/ contrasts
over time in different phonetic environments?


• What are the effects of HVPT on production of the same sounds over time
in different phonetic environments?


• What is the relationship between perception and production?


2. Method 


2.1. Participants


The learners who agreed to participate in this research were first-year non-English
majors at a university in Tokyo. They were enrolled in compulsory English courses
consisting of two classes: Class A and Class B. After excluding those who scored
100% on the pre-test and those who could not take all the tests, 30 students were
eligible for data analysis: Class A (n=13; four males and nine females) and Class B
(n=17; 11 males and six females). According to their TOEIC listening scores (Class
A, M=363.5, SD=48.5; Class B, M=277.2, SD=56.6), their English proficiency
could be categorized as B1 in CEFR levels based on the score bands provided by
the test provider, the Institute for International Business Communication.


2.2. Treatment 


A pre-test and post-test design was adopted for a ten-week treatment period during 
the fall semester in 2017. During training, the target sounds were presented either 
in Consonant + Vowel (CV) environments or Consonant + Vowel + Consonant 
(CVC) environments, while the test stimuli utilized 100 CV items consisting of the 
three target consonants randomly followed by a vowel, such as /li/, /ru/ or /wa/. The 
sound combinations were also randomized, as were the thirty talkers’ stimuli.
Mid-tests were conducted after five weeks, for perception only. In the first and the
tenth week, the participants’ production was recorded by having them produce target
items in the carrier phrase: “Now I say ___.” (Thomson, 2012).
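
The randomization of test stimuli described above can be illustrated with a minimal sketch. It assumes a five-vowel set and a roughly balanced number of items per consonant; neither detail is specified in this paper, and the function and constant names are hypothetical rather than part of English Accent Coach.

```python
import random

CONSONANTS = ["l", "r", "w"]          # the three target consonants in this study
VOWELS = ["i", "u", "a", "e", "o"]    # assumed vowel set, for illustration only
N_TALKERS = 30                        # thirty talkers, as described for the tool


def build_test_trials(n_items=100, seed=None):
    """Return a shuffled list of (talker_id, syllable) CV trials."""
    rng = random.Random(seed)
    trials = []
    for i in range(n_items):
        # Cycle through the consonants so each appears about equally often
        # (an assumption; the paper does not state the balance).
        consonant = CONSONANTS[i % len(CONSONANTS)]
        vowel = rng.choice(VOWELS)                 # vowel chosen at random
        talker = rng.randrange(1, N_TALKERS + 1)   # talker chosen at random
        trials.append((talker, consonant + vowel))
    rng.shuffle(trials)                            # randomize presentation order
    return trials


if __name__ == "__main__":
    for talker, syllable in build_test_trials(seed=2017)[:5]:
        print(f"talker {talker:2d}: /{syllable}/")
```

Passing a fixed seed makes the trial order reproducible, which may be useful if the same list is to be reused across pre-, mid-, and post-tests.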


Training comprised three 200-item perceptual training sessions per week. Over 
the ten weeks, Class A learners were trained to perceive the English consonants 
in syllable-onset position in CV frames for the first five weeks, up to mid-test, 
followed by CVC frames for another five weeks up to the post-test. Class B was trained
in the opposite order. In each class, the learners completed the first training
session of the week in class and were assigned to complete the remaining sessions
during the week. They submitted three PDF
feedback forms through Sakai, a course management system, every week. The
researcher asked them to complete only one training session on a given day (i.e.
they could not do multiple sessions back-to-back).


3. Results and discussion 


3.1. RQ1: effects of HVPT on recognition over time 


The means of the tests for the two classes exhibited medium to large effect sizes
between pre- and post-tests (CV: Cohen’s d=.78; CVC: d=.58, Table 1). In both
of the phonetic environments, HVPT showed immediate positive effects
and persistence (CV: pre- to mid-test in Class A, d=.46; mid- to post-test in Class
B, d=.74). These results seem to be in accordance with the results of Logan et al.
(1991) in that the linguistic environments for HVPT make a difference. In
addition, the CVC environment showed higher average scores than CV. The CVC
stimuli, even non-words, may have sounded more word-like than the CVs.
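
As an illustration of how an effect size such as Cohen’s d is obtained, the following sketch uses the pooled sample standard deviation of two score lists. This is one common variant; the paper does not state which formula was used, so the sketch is not expected to reproduce the reported values, and the scores below are hypothetical.

```python
import math
import statistics


def cohens_d(pre, post):
    """Cohen's d for two score lists, using the pooled sample standard deviation."""
    n1, n2 = len(pre), len(post)
    var1 = statistics.variance(pre)    # sample variance (n - 1 denominator)
    var2 = statistics.variance(post)
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (statistics.mean(post) - statistics.mean(pre)) / pooled_sd


# Hypothetical percentage-correct scores for five learners (not the study's data).
pre_scores = [60, 70, 75, 80, 65]
post_scores = [72, 80, 85, 88, 75]

if __name__ == "__main__":
    print(round(cohens_d(pre_scores, post_scores), 2))
```

By the conventional benchmarks, d values around .5 are read as medium and values around .8 as large.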


Table 1. Mean of correct percentages (SD) in perception tests of CV and CVC environments

          | CV tests (%)                            | CVC tests (%)
          | Pre          Mid          Post          | Mid          Post
Class A   | 73.2 (17.5)  83.2 (12.6)  83.7 (10.9)   | 87.3 (7.3)   96.9 (2.5)
Class B   | 68.6 (10.7)  73.4 (11.4)  84.5 (9.7)    | 94.6 (3.2)   94.1 (2.9)
Total     | 70.6 (14.1)  77.8 (12.7)  84.1 (10.1)   | 91.3 (6.1)   95.3 (3.2)


Among the three target phonemes, identification of /r/ was the lowest (50%),
followed by /l/ (64%) and /w/ (90%) in the CV pre-test. However, /r/ made the
largest progress immediately after the training (30% in both environments). In
comparison, /l/ made a maximum progress of 19% in CV and 12% in CVC. The
sound of /w/ had high scores in the beginning (90%) and reached the ceiling (99%)
over a short period.
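
Per-phoneme identification rates like those above can be tallied from trial-level responses. The sketch below assumes each trial is stored as a (target, response) pair; this data layout and the sample responses are hypothetical and are not taken from the study.

```python
from collections import Counter


def accuracy_by_target(trials):
    """Map each target consonant to its percentage of correct identifications.

    `trials` is an iterable of (target, response) pairs.
    """
    totals = Counter()
    correct = Counter()
    for target, response in trials:
        totals[target] += 1
        if response == target:
            correct[target] += 1
    return {t: 100.0 * correct[t] / totals[t] for t in totals}


# Hypothetical responses (not the study's data).
sample = [("r", "l"), ("r", "r"), ("l", "l"), ("l", "r"), ("l", "l"), ("w", "w")]

if __name__ == "__main__":
    for phoneme, pct in accuracy_by_target(sample).items():
        print(f"/{phoneme}/: {pct:.0f}%")
```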


3.2. RQ2: effects of HVPT on production over time 


An increase of approximately 13% was observed in CV and CVC production tests
between pre- and post-tests (d=.44). In particular, Class B showed larger
progress in both CV and CVC environments, with more than a 20% increase
(Table 2). This positive transfer is consistent with the findings of Bradlow et al.
(1999) that perception-only training improves production. HVPT may have exerted
more influence on learners at an intermediate level of L2 English than on
those at a higher level.


It was also found that the order of sound difficulty was the same as for perception.
In addition, larger progress was observed in /l/ and /r/ in CVC than in CV. These
similarities to perception may reflect the distance between the learners’ L1 and L2
(Munro & Derwing, 2006).


Table 2. Mean of correct percentages (SD) in production tests of CV and CVC environments

          | CV test (%)                | CVC test (%)
          | T1           T3            | T1           T3
Class A   | 59.8 (15.4)  62.4 (22.3)   | 65.4 (13.1)  65.0 (22.6)
Class B   | 53.5 (35.6)  74.3 (23.2)   | 54.5 (22.1)  78.1 (17.5)
Total     | 56.3 (33.7)  69.0 (24.1)   | 59.4 (20.0)  72.2 (22.6)


3.3. RQ3: relationship between perception and production


Pearson’s correlation coefficients between the recognition test gain and the 
production test gain were r=-.20 in CV and r=-.43 in CVC. The results indicate
that progress in production was not necessarily made by the participants who made
progress in recognition. This gap may come from the EFL situation, where the
learners had limited opportunities for oral communication outside the classroom,
yet a perceptual foundation prepares learners for production.
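
The gain-score correlation reported above can be sketched as follows: gains are computed as post-test minus pre-test for each learner, and Pearson’s r is then calculated over the two gain lists. The scores are hypothetical and the helper names are illustrative only.

```python
import math


def gains(pre, post):
    """Per-learner gain scores: post-test minus pre-test."""
    return [b - a for a, b in zip(pre, post)]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a sequence has no variance")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical percentage-correct scores for four learners (not the study's data).
perception_gain = gains([60, 70, 80, 75], [85, 80, 85, 90])
production_gain = gains([50, 55, 60, 65], [55, 70, 80, 70])

if __name__ == "__main__":
    print(round(pearson_r(perception_gain, production_gain), 2))
```

A negative r, as in the reported results, means that learners with larger perception gains tended to show smaller production gains.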


4. Conclusion 


This study found that HVPT in computer assisted pronunciation training had
positive effects to a large degree on perception, but only to a small degree on
production. Despite the gap, it is important for EFL learners to develop a robust
acoustic image to be drawn on for production. In this sense, HVPT as implemented
in English Accent Coach has a strong potential to change the paradigm of
pronunciation learning and teaching in
EFL environments.